*NOTE: This code shows how we coded the raw external data. Users can skip this step and use the file
*"combined_external_data.dta", but they will still need to create the religious diversity measure
*from the raw GFS data (code available below).

*WHO mortality data, download data from WHO
import delimited "adult_mortality_rate_15_60.csv", clear
*Adult mortality rate (probability of dying between 15 and 60 years per 1000 population)
ren factvaluenumeric mort_rate
keep if dim1=="Both sexes"
ren location country_name
ren spatial iso
ren parentlocation continent
keep if islatestyear=="true"
ren period year
keep mort_rate country_name continent iso year
label var mort_rate "Adult mortality rate (probability of dying between 15 and 60 years per 1000 population)"
egen zmort_rate=std(mort_rate)
label var zmort_rate "Z-score of adult mortality rate" 
*(probability of dying between 15 and 60 years per 1000 population)"
save "cleaned_who_mortality.dta", replace

**World Values Survey, download data from WVS
import delimited "\wvs_cultural_map.csv", clear
replace country_year=subinstr(country_year, ")", "", .)
split country_year , parse("(") gen(country_name)
drop country_year
ren country_name1 country_name
ren country_name2 year
destring year, replace
bysort country_name: egen my=max(year)
replace country_name=strrtrim(country_name)
replace country_name="United States" if country_name=="United States "
keep if my==year
drop s025
drop my
tab country_name
replace country_name="United Kingdom" if country_name=="Great Britain"
replace country_name="Hong Kong" if country_name=="Hong Kong SAR"
egen zwvs_traditional=std(wvs_traditional)
label var wvs_traditional "Country level index of secular over traditional values"
label var zwvs_traditional "Z-score of country level index of secular over traditional values"
label var wvs_survival "Country level index of self-expression over survival values"
cor wvs_traditional wvs_survival welzel_secular welzel_emancipative
save "cleaned_wvs_tradition.dta", replace


**Women's peace and security index, download data (see text for information)
import delimited "women_peace_index.csv", clear
drop rank
ren country country_name
foreach x in education employment fin_inclusion cell_phone parliament no_legal_discrimination justice mortality_ratio son_bias partner_violence safety terrorism conflict {
ren `x' wps_`x'
}
drop v17 v18
drop if country_name==""
replace country_name="Turkey" if country_name=="Türkiye"
destring wps_index, replace
foreach x in wps_partner_violence wps_no_legal_discrimination wps_index wps_justice wps_terrorism wps_conflict {
egen z`x'=std(`x')
}
label var wps_index "Women's Peace and Security Index"
label var zwps_index "Z-score of Women's Peace and Security Index"
label var wps_terrorism "Political violence targeting women"
label var wps_partner_violence "Intimate partner violence victimization rate for women"
label var wps_no_legal_discrimination "Absence of legal discrimination"
label var wps_justice "Extent to which women can pursue legal remedies to defend rights"
save "womens_rights_index.dta", replace

**World Bank
import delimited "GDP_per_capita_PPP_adj2017.csv", clear
ren country country_name
replace country_name="Egypt" if iso=="EGY"
replace country_name="Turkey" if iso=="TUR"
replace country_name="Hong Kong" if iso=="HKG"
drop ind*
gen lngdp=ln(gdp_pc_max)
label var lngdp "Natural log of GDP per capita in PPP dollars"
save "World_Bank\development.dta", replace


*V-DEM
use "VDem\V-Dem-CY-Core_STATA_v13\V-Dem-CY-Core-v13.dta", clear
keep if year==2022
keep country_name country_id v2x_polyarchy v2x_partipdem v2x_egaldem v2x_freexp_altinf v2x_suffr v2xel_frefair
sort country_name
replace country_name="United States" if country_name=="United States of America"
replace country_name=strrtrim(country_name)
replace country_name=strltrim(country_name)
egen zv2x_polyarchy=std(v2x_polyarchy)
label var zv2x_polyarchy "Z-score of electoral democracy index"
save "democracy_index.dta", replace

*Create macro-level measures of religious diversity from the Global Flourishing Study (GFS)
use "${clean_data}\gfs_cleaned_merged_database.dta", clear

rename * , lower
ren annual_weight1  weight

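*Flag respondents who report no religion (rel2==97) against those coded 1-96
gen self_not_relig=0 if rel2>=1 & rel2<=96
replace self_not_relig=1 if rel2==97

*Religious category, expanded into one indicator variable per category
gen sect=rel2
replace sect=0 if self_not_relig==1
*Note: rel2>96 also matches rel2==97 (and missing rel2), so the next line sets the non-religious back to missing
replace sect=. if rel2>96
replace sect=. if rel2<0
tab sect, gen(sect)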

collapse (mean) sect1-sect16 [aw=weight], by(country)

forval i=1/16 {
	gen sect`i'_sq=(sect`i')^2
}
egen SUM=rowtotal(sect*_sq)

*Generate diversity index
gen DIV=1-SUM
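*Optional check, added as a sketch rather than as part of the original workflow.
*DIV is one minus the sum of squared religious-group shares (a Herfindahl-type index),
*so the shares should sum to 1 within each country and DIV should lie between 0 and 1.
*share_total is a hypothetical helper name; it is removed by the keep command further down.
*The 1e-6 tolerance is an assumption to absorb floating-point rounding.
egen share_total=rowtotal(sect1-sect16)
assert abs(share_total-1)<1e-6
assert inrange(DIV, -1e-6, 1)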
sum DIV
gsort -DIV
list country DIV
keep country DIV
label var DIV "Religious diversity index"
save "religious_diversity_by_country.dta", replace

**Combine all external/country-level data

**V-DEM
use "G:\Survey_Methodology\People\Family Study\Global Flourishing Study\external data\democracy_index.dta", clear
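*Optional check, added as a sketch: the merges in this section use m:1, which would not complain
*if a country appeared more than once in this master file. isid stops with an error
*unless country_name uniquely identifies the rows.
isid country_name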

**Women's rights index
merge m:1 country_name using "G:\Survey_Methodology\People\Family Study\Global Flourishing Study\external data\womens_rights_index.dta"
drop if _merge==2
drop _merge

**GDP per capita, PPP
merge m:1 country_name using  "G:\Survey_Methodology\People\Family Study\Global Flourishing Study\external data\World_Bank\development.dta"
drop if _merge==2
drop _merge

**Tradition
merge m:1 country_name using "G:\Survey_Methodology\People\Family Study\Global Flourishing Study\external data\WVS\cleaned_wvs_tradition.dta"
drop if _merge==2
drop _merge

**Religious Diversity (derived from GFS)
*Note: this measure is omitted from the file we provide to comply with data access restrictions. Create it using the code above.

**WHO mortality (Hong Kong is missing)
merge m:1 iso using "G:\Survey_Methodology\People\Family Study\Global Flourishing Study\external data\WVS\cleaned_who_mortality.dta"
drop if _merge==2
drop _merge

global country_level_controls "v2x_polyarchy zwvs_traditional zwps_index lngdp"
global country_valid "zmort_rate"
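
*Optional check, added as a sketch: tabulate missing values in the country-level measures
*named in the two globals above, so that countries lost in the merges (for example Hong Kong
*in the WHO data) can be seen before the file is saved.
misstable summarize $country_level_controls $country_valid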

save "${clean_data}\combined_external_data.dta", replace

/*
Notes on V-DEM from website:
2 V-Dem Democracy Indices
2.1 V-Dem High-Level Democracy Indices
This section groups together macro-level indices that describe features of democracy at the highest
(most abstract) level. Please see Appendix A for an overview of all indices, component-indices, and
lower-level indices.

2.1.1 Electoral democracy index (D) (v2x_polyarchy)
Project Manager(s): Jan Teorell
Additional versions: *_codelow, *_codehigh, *_sd
Question: To what extent is the ideal of electoral democracy in its fullest sense achieved?
Clarification: The electoral principle of democracy seeks to embody the core value of making rulers
responsive to citizens, achieved through electoral competition for the electorate's approval
under circumstances when suffrage is extensive; political and civil society organizations can
operate freely; elections are clean and not marred by fraud or systematic irregularities; and
elections affect the composition of the chief executive of the country. In between elections,
there is freedom of expression and an independent media capable of presenting alternative
views on matters of political relevance. In the V-Dem conceptual scheme, electoral democracy
is understood as an essential element of any other conception of representative democracy —
liberal, participatory, deliberative, egalitarian, or some other.
Scale: Interval, from low to high (0-1).
Source(s): v2x_freexp_altinf v2x_frassoc_thick v2x_suffr v2xel_frefair v2x_elecoff
Data release: 1-13. Release 1-5 used a different, preliminary aggregation formula.
Aggregation: The index is formed by taking the average of, on the one hand, the weighted average
of the indices measuring freedom of association thick (v2x_frassoc_thick), clean elections
(v2xel_frefair), freedom of expression (v2x_freexp_altinf), elected officials (v2x_elecoff), and
suffrage (v2x_suffr) and, on the other, the five-way multiplicative interaction between those
indices. This is half way between a straight average and strict multiplication, meaning the
average of the two. It is thus a compromise between the two most well known aggregation
formulas in the literature, both allowing partial "compensation" in one sub-component for
lack of polyarchy in the others, but also punishing countries not strong in one sub-component
according to the "weakest link" argument. The aggregation is done at the level of Dahl's subcomponents
with the one exception of the non-electoral component. The index is aggregated
using this formula:
v2x_polyarchy = .5 * MPI + .5 * API
*/